Overview

Dataset statistics

Number of variables: 16
Number of observations: 50000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 90
Duplicate rows (%): 0.2%
Total size in memory: 5.8 MiB
Average record size in memory: 121.0 B

Variable types

Numeric: 10
Boolean: 1
Categorical: 5

Warnings

month signup_date has constant value "1" (Constant)
Dataset has 90 (0.2%) duplicate rows (Duplicates)
date_delta is highly correlated with month last_trip_date (High correlation)
month last_trip_date is highly correlated with date_delta (High correlation)
month signup_date is highly correlated with luxury_car_user and 4 other fields (High correlation)
luxury_car_user is highly correlated with month signup_date (High correlation)
city_2 is highly correlated with month signup_date (High correlation)
city_0 is highly correlated with month signup_date (High correlation)
active is highly correlated with month signup_date (High correlation)
city_1 is highly correlated with month signup_date (High correlation)
phone has 15022 (30.0%) zeros (Zeros)
surge_pct has 34409 (68.8%) zeros (Zeros)
trips_in_first_30_days has 15390 (30.8%) zeros (Zeros)
weekday_pct has 9203 (18.4%) zeros (Zeros)
date_delta has 2302 (4.6%) zeros (Zeros)
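
The "Zeros" warnings above are plain counts and shares of zero-valued entries. A minimal sketch of that arithmetic follows, using only the Python standard library. The helper name `zero_share` is hypothetical and is not part of pandas-profiling; the sample values are the `surge_pct` entries of the ten "First rows" shown later in this report.

```python
def zero_share(values):
    """Return (count, percent) of entries that are exactly zero."""
    zeros = sum(1 for v in values if v == 0)
    percent = 100.0 * zeros / len(values) if values else 0.0
    return zeros, percent


# surge_pct values of the ten "First rows" in this report.
surge_pct_sample = [15.4, 0.0, 0.0, 20.0, 11.8, 0.0, 0.0, 0.0, 0.0, 0.0]
count, percent = zero_share(surge_pct_sample)
print(f"surge_pct sample has {count} ({percent:.1f}%) zeros")

# The report-level figure follows the same arithmetic: 34409 of 50000 rows.
print(f"full column: {100.0 * 34409 / 50000:.1f}%")
```

Whether a high share of zeros is a problem depends on the variable: for `surge_pct` it simply means most riders never paid a surge fare.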

Reproduction

Analysis started: 2021-03-12 19:44:54.573245
Analysis finished: 2021-03-12 19:45:52.452999
Duration: 57.88 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

avg_dist
Real number (ℝ≥0)

Distinct: 2908
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7968266
Minimum: 0
Maximum: 160.96
Zeros: 150
Zeros (%): 0.3%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 1.2
Q1: 2.42
Median: 3.88
Q3: 6.94
95th percentile: 16.78
Maximum: 160.96
Range: 160.96
Interquartile range (IQR): 4.52

Descriptive statistics

Standard deviation: 5.707356703
Coefficient of variation (CV): 0.984565711
Kurtosis: 29.19171296
Mean: 5.7968266
Median Absolute Deviation (MAD): 1.82
Skewness: 3.464170294
Sum: 289841.33
Variance: 32.57392054
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 150 | 0.3%
2.3 | 116 | 0.2%
2.29 | 116 | 0.2%
2.36 | 114 | 0.2%
2.7 | 114 | 0.2%
2.73 | 114 | 0.2%
2.65 | 113 | 0.2%
2.5 | 113 | 0.2%
2.4 | 110 | 0.2%
2.54 | 110 | 0.2%
Other values (2898) | 48830 | 97.7%

Minimum 5 values

Value | Count | Frequency (%)
0 | 150 | 0.3%
0.01 | 38 | 0.1%
0.02 | 14 | < 0.1%
0.03 | 6 | < 0.1%
0.04 | 12 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
160.96 | 1 | < 0.1%
129.89 | 1 | < 0.1%
79.69 | 1 | < 0.1%
79.34 | 1 | < 0.1%
77.13 | 1 | < 0.1%
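
Several of the avg_dist statistics above are derived from one another, which gives a quick consistency check when reading the report. The sketch below recomputes them from the printed values only. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, sum = mean × number of observations) and is not taken from the pandas-profiling source.

```python
# Values copied from the avg_dist section of this report.
mean = 5.7968266
std = 5.707356703
q1, q3 = 2.42, 6.94
minimum, maximum = 0.0, 160.96
n_observations = 50000

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range
total = mean * n_observations        # sum of all values

print(f"CV       = {cv:.9f}")        # report: 0.984565711
print(f"Variance = {variance:.8f}")  # report: 32.57392054
print(f"IQR      = {iqr:.2f}")       # report: 4.52
print(f"Range    = {value_range:.2f}")  # report: 160.96
print(f"Sum      = {total:.2f}")     # report: 289841.33
```

A CV close to 1 together with a skewness of about 3.5 indicates a long right tail: most trips are short, with a few very long ones.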

avg_rating_by_driver
Real number (ℝ≥0)

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.778158196
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 4
Q1: 4.7
Median: 5
Q3: 5
95th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.4457531013
Coefficient of variation (CV): 0.09328973278
Kurtosis: 24.33824456
Mean: 4.778158196
Median Absolute Deviation (MAD): 0
Skewness: -4.137232874
Sum: 238907.9098
Variance: 0.1986958273
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=28)
Common values

Value | Count | Frequency (%)
5 | 28508 | 57.0%
4.8 | 4537 | 9.1%
4.7 | 3330 | 6.7%
4.9 | 3094 | 6.2%
4.5 | 2424 | 4.8%
4.6 | 2078 | 4.2%
4 | 1914 | 3.8%
4.3 | 1018 | 2.0%
4.4 | 860 | 1.7%
3 | 602 | 1.2%
Other values (18) | 1635 | 3.3%

Minimum 5 values

Value | Count | Frequency (%)
1 | 181 | 0.4%
1.5 | 4 | < 0.1%
2 | 126 | 0.3%
2.3 | 1 | < 0.1%
2.5 | 31 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
5 | 28508 | 57.0%
4.9 | 3094 | 6.2%
4.8 | 4537 | 9.1%
4.778158196 | 201 | 0.4%
4.7 | 3330 | 6.7%

avg_rating_of_driver
Real number (ℝ≥0)

Distinct: 38
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.601559291
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 3.5
Q1: 4.5
Median: 4.7
Q3: 5
95th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.5649765903
Coefficient of variation (CV): 0.1227793786
Kurtosis: 10.2979159
Mean: 4.601559291
Median Absolute Deviation (MAD): 0.3
Skewness: -2.653535617
Sum: 230077.9646
Variance: 0.3191985475
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=38)
Common values

Value | Count | Frequency (%)
5 | 20771 | 41.5%
4.601559291 | 8122 | 16.2%
4 | 4193 | 8.4%
4.5 | 2498 | 5.0%
4.8 | 2430 | 4.9%
4.7 | 1945 | 3.9%
4.9 | 1771 | 3.5%
4.3 | 1487 | 3.0%
4.6 | 1143 | 2.3%
3 | 1003 | 2.0%
Other values (28) | 4637 | 9.3%

Minimum 5 values

Value | Count | Frequency (%)
1 | 256 | 0.5%
1.5 | 4 | < 0.1%
1.6 | 1 | < 0.1%
1.7 | 2 | < 0.1%
1.8 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
5 | 20771 | 41.5%
4.9 | 1771 | 3.5%
4.8 | 2430 | 4.9%
4.7 | 1945 | 3.9%
4.601559291 | 8122 | 16.2%
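
The value 4.601559291 occurs 8122 times and equals the reported mean of avg_rating_of_driver. avg_rating_by_driver shows the same pattern, with its mean 4.778158196 occurring 201 times. One plausible reading, which is an assumption because the report does not say how the data was prepared, is that missing ratings were filled with the column mean before profiling. The sketch below shows that kind of imputation and why it leaves the mean unchanged; `impute_with_mean` is a hypothetical helper, not a function from the report or from pandas-profiling.

```python
def impute_with_mean(values):
    """Replace None entries with the mean of the observed entries.

    Returns the filled list and the mean that was used.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    return filled, mean


# Hypothetical ratings with two missing entries.
ratings = [5.0, None, 4.0, None]
filled, mean = impute_with_mean(ratings)
print(filled)                      # the gaps now hold the observed mean
print(sum(filled) / len(filled))   # the column mean is unchanged
```

If this reading is right, the 16.2% share of that single value reflects how many ratings were originally missing rather than a real rating level, and the reported standard deviation understates the spread of the observed ratings.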

avg_surge
Real number (ℝ≥0)

Distinct: 115
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0747638
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 1.05
95th percentile: 1.38
Maximum: 8
Range: 7
Interquartile range (IQR): 0.05

Descriptive statistics

Standard deviation: 0.2223360089
Coefficient of variation (CV): 0.206869648
Kurtosis: 77.28146676
Mean: 1.0747638
Median Absolute Deviation (MAD): 0
Skewness: 6.821346191
Sum: 53738.19
Variance: 0.04943330088
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 34454 | 68.9%
1.25 | 1100 | 2.2%
1.13 | 956 | 1.9%
1.02 | 809 | 1.6%
1.08 | 798 | 1.6%
1.04 | 774 | 1.5%
1.06 | 770 | 1.5%
1.05 | 704 | 1.4%
1.03 | 619 | 1.2%
1.07 | 616 | 1.2%
Other values (105) | 8400 | 16.8%

Minimum 5 values

Value | Count | Frequency (%)
1 | 34454 | 68.9%
1.01 | 484 | 1.0%
1.02 | 809 | 1.6%
1.03 | 619 | 1.2%
1.04 | 774 | 1.5%

Maximum 5 values

Value | Count | Frequency (%)
8 | 1 | < 0.1%
5.75 | 1 | < 0.1%
5 | 5 | < 0.1%
4.75 | 1 | < 0.1%
4.5 | 4 | < 0.1%

phone
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.69732
Minimum: 0
Maximum: 1
Zeros: 15022
Zeros (%): 30.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 1
95th percentile: 1
Maximum: 1
Range: 1
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.4578945413
Coefficient of variation (CV): 0.6566490869
Kurtosis: -1.251599269
Mean: 0.69732
Median Absolute Deviation (MAD): 0
Skewness: -0.8613514419
Sum: 34866
Variance: 0.2096674109
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
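Common values

Value | Count | Frequency (%)
1 | 34624 | 69.2%
0 | 15022 | 30.0%
0.8 | 184 | 0.4%
0.6 | 136 | 0.3%
0.4 | 32 | 0.1%
0.2 | 2 | < 0.1%

Minimum 5 values

Value | Count | Frequency (%)
0 | 15022 | 30.0%
0.2 | 2 | < 0.1%
0.4 | 32 | 0.1%
0.6 | 136 | 0.3%
0.8 | 184 | 0.4%

Maximum 5 values

Value | Count | Frequency (%)
1 | 34624 | 69.2%
0.8 | 184 | 0.4%
0.6 | 136 | 0.3%
0.4 | 32 | 0.1%
0.2 | 2 | < 0.1%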

surge_pct
Real number (ℝ≥0)

ZEROS

Distinct: 367
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.849536
Minimum: 0
Maximum: 100
Zeros: 34409
Zeros (%): 68.8%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 8.6
95th percentile: 50
Maximum: 100
Range: 100
Interquartile range (IQR): 8.6

Descriptive statistics

Standard deviation: 19.9588109
Coefficient of variation (CV): 2.255351117
Kurtosis: 10.43684717
Mean: 8.849536
Median Absolute Deviation (MAD): 0
Skewness: 3.14412393
Sum: 442476.8
Variance: 398.3541325
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 34409 | 68.8%
100 | 1416 | 2.8%
50 | 1367 | 2.7%
33.3 | 1152 | 2.3%
25 | 906 | 1.8%
20 | 790 | 1.6%
16.7 | 708 | 1.4%
14.3 | 533 | 1.1%
12.5 | 439 | 0.9%
11.1 | 393 | 0.8%
Other values (357) | 7887 | 15.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 34409 | 68.8%
0.4 | 1 | < 0.1%
0.5 | 3 | < 0.1%
0.6 | 1 | < 0.1%
0.7 | 5 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
100 | 1416 | 2.8%
85.7 | 2 | < 0.1%
83.3 | 3 | < 0.1%
80 | 11 | < 0.1%
75 | 34 | 0.1%

trips_in_first_30_days
Real number (ℝ≥0)

ZEROS

Distinct: 59
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2782
Minimum: 0
Maximum: 125
Zeros: 15390
Zeros (%): 30.8%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 3
95th percentile: 9
Maximum: 125
Range: 125
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.792684069
Coefficient of variation (CV): 1.664772219
Kurtosis: 56.57119678
Mean: 2.2782
Median Absolute Deviation (MAD): 1
Skewness: 5.167754879
Sum: 113910
Variance: 14.38445245
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
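Common values

Value | Count | Frequency (%)
0 | 15390 | 30.8%
1 | 14108 | 28.2%
2 | 7402 | 14.8%
3 | 3788 | 7.6%
4 | 2562 | 5.1%
5 | 1616 | 3.2%
6 | 1134 | 2.3%
7 | 819 | 1.6%
8 | 589 | 1.2%
9 | 471 | 0.9%
Other values (49) | 2121 | 4.2%

Minimum 5 values

Value | Count | Frequency (%)
0 | 15390 | 30.8%
1 | 14108 | 28.2%
2 | 7402 | 14.8%
3 | 3788 | 7.6%
4 | 2562 | 5.1%

Maximum 5 values

Value | Count | Frequency (%)
125 | 1 | < 0.1%
73 | 1 | < 0.1%
71 | 1 | < 0.1%
63 | 1 | < 0.1%
58 | 1 | < 0.1%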

luxury_car_user
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 49.0 KiB

Value | Count | Frequency (%)
False | 31146 | 62.3%
True | 18854 | 37.7%

weekday_pct
Real number (ℝ≥0)

ZEROS

Distinct: 666
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 60.926084
Minimum: 0
Maximum: 100
Zeros: 9203
Zeros (%): 18.4%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 33.3
Median: 66.7
Q3: 100
95th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 66.7

Descriptive statistics

Standard deviation: 37.08150341
Coefficient of variation (CV): 0.6086309996
Kurtosis: -1.154187819
Mean: 60.926084
Median Absolute Deviation (MAD): 33.3
Skewness: -0.4777875001
Sum: 3046304.2
Variance: 1375.037895
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
100 | 16659 | 33.3%
0 | 9203 | 18.4%
50 | 4057 | 8.1%
66.7 | 2088 | 4.2%
33.3 | 1619 | 3.2%
75 | 1104 | 2.2%
60 | 772 | 1.5%
25 | 723 | 1.4%
80 | 668 | 1.3%
40 | 593 | 1.2%
Other values (656) | 12514 | 25.0%

Minimum 5 values

Value | Count | Frequency (%)
0 | 9203 | 18.4%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
5.9 | 1 | < 0.1%
6.3 | 3 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
100 | 16659 | 33.3%
99 | 1 | < 0.1%
98.9 | 2 | < 0.1%
98.5 | 1 | < 0.1%
98.4 | 2 | < 0.1%

active
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
0: 31690
1: 18310

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Value | Count | Frequency (%)
0 | 31690 | 63.4%
1 | 18310 | 36.6%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 31690 | 63.4%
1 | 18310 | 36.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 50000 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
0 | 31690 | 63.4%
1 | 18310 | 36.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 50000 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 31690 | 63.4%
1 | 18310 | 36.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 50000 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 31690 | 63.4%
1 | 18310 | 36.6%

city_0
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
0.0: 33466
1.0: 16534

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 150000
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 1.0
4th row: 0.0
5th row: 0.0

Value | Count | Frequency (%)
0.0 | 33466 | 66.9%
1.0 | 16534 | 33.1%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 83466 | 55.6%
. | 50000 | 33.3%
1 | 16534 | 11.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 100000 | 66.7%
Other Punctuation | 50000 | 33.3%

Most frequent character per category

Value | Count | Frequency (%)
0 | 83466 | 83.5%
1 | 16534 | 16.5%

Value | Count | Frequency (%)
. | 50000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 150000 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 83466 | 55.6%
. | 50000 | 33.3%
1 | 16534 | 11.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 150000 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 83466 | 55.6%
. | 50000 | 33.3%
1 | 16534 | 11.0%

city_1
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
0.0: 39870
1.0: 10130

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 150000
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 0.0
4th row: 1.0
5th row: 0.0

Value | Count | Frequency (%)
0.0 | 39870 | 79.7%
1.0 | 10130 | 20.3%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 89870 | 59.9%
. | 50000 | 33.3%
1 | 10130 | 6.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 100000 | 66.7%
Other Punctuation | 50000 | 33.3%

Most frequent character per category

Value | Count | Frequency (%)
0 | 89870 | 89.9%
1 | 10130 | 10.1%

Value | Count | Frequency (%)
. | 50000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 150000 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 89870 | 59.9%
. | 50000 | 33.3%
1 | 10130 | 6.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 150000 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 89870 | 59.9%
. | 50000 | 33.3%
1 | 10130 | 6.8%

city_2
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
0.0: 26664
1.0: 23336

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 150000
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 1.0

Value | Count | Frequency (%)
0.0 | 26664 | 53.3%
1.0 | 23336 | 46.7%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 76664 | 51.1%
. | 50000 | 33.3%
1 | 23336 | 15.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 100000 | 66.7%
Other Punctuation | 50000 | 33.3%

Most frequent character per category

Value | Count | Frequency (%)
0 | 76664 | 76.7%
1 | 23336 | 23.3%

Value | Count | Frequency (%)
. | 50000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 150000 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 76664 | 51.1%
. | 50000 | 33.3%
1 | 23336 | 15.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 150000 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 76664 | 51.1%
. | 50000 | 33.3%
1 | 23336 | 15.6%

date_delta
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 182
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.7901
Minimum: 0
Maximum: 181
Zeros: 2302
Zeros (%): 4.6%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 27
Median: 110
Q3: 150
95th percentile: 170
Maximum: 181
Range: 181
Interquartile range (IQR): 123

Descriptive statistics

Standard deviation: 62.12982154
Coefficient of variation (CV): 0.6695738181
Kurtosis: -1.438478171
Mean: 92.7901
Median Absolute Deviation (MAD): 48
Skewness: -0.3237976574
Sum: 4639505
Variance: 3860.114724
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
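Common values

Value | Count | Frequency (%)
1 | 4374 | 8.7%
0 | 2302 | 4.6%
2 | 1063 | 2.1%
155 | 756 | 1.5%
154 | 687 | 1.4%
3 | 595 | 1.2%
153 | 595 | 1.2%
162 | 584 | 1.2%
156 | 579 | 1.2%
148 | 575 | 1.1%
Other values (172) | 37890 | 75.8%

Minimum 5 values

Value | Count | Frequency (%)
0 | 2302 | 4.6%
1 | 4374 | 8.7%
2 | 1063 | 2.1%
3 | 595 | 1.2%
4 | 433 | 0.9%

Maximum 5 values

Value | Count | Frequency (%)
181 | 13 | < 0.1%
180 | 72 | 0.1%
179 | 143 | 0.3%
178 | 152 | 0.3%
177 | 202 | 0.4%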

month signup_date
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 390.8 KiB
1: 50000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value | Count | Frequency (%)
1 | 50000 | 100.0%

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
1 | 50000 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 50000 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
1 | 50000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 50000 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
1 | 50000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 50000 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
1 | 50000 | 100.0%

month last_trip_date
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.04232
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Memory size: 390.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 5
Q3: 6
95th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.992879437
Coefficient of variation (CV): 0.4930038781
Kurtosis: -1.394615843
Mean: 4.04232
Median Absolute Deviation (MAD): 1
Skewness: -0.4290588893
Sum: 202116
Variance: 3.971568449
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)
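Common values

Value | Count | Frequency (%)
6 | 18256 | 36.5%
1 | 10147 | 20.3%
5 | 7585 | 15.2%
4 | 4588 | 9.2%
3 | 4568 | 9.1%
2 | 4308 | 8.6%
7 | 548 | 1.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 10147 | 20.3%
2 | 4308 | 8.6%
3 | 4568 | 9.1%
4 | 4588 | 9.2%
5 | 7585 | 15.2%

Maximum 5 values

Value | Count | Frequency (%)
7 | 548 | 1.1%
6 | 18256 | 36.5%
5 | 7585 | 15.2%
4 | 4588 | 9.2%
3 | 4568 | 9.1%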

Interactions

[Figures: 90 pairwise interaction plots (Matplotlib SVG); images not available in this text.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
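
The calculation described above can be sketched in a few lines of standard-library Python. This is an illustration under the stated definition, not the code pandas-profiling runs; the function name `pearson_r` and the sample data are hypothetical. The 1/n factors of the covariance and of the two standard deviations cancel, so the sketch works with plain sums.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # r is undefined for a constant variable
    return cov / (spread_x * spread_y)


# Hypothetical data: y is roughly linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(pearson_r(x, y))

# Shifting and positively rescaling x leaves r unchanged (up to rounding).
print(pearson_r([10.0 * a + 3.0 for a in x], y))
```

Note that a constant column, such as month signup_date in this dataset, has zero standard deviation, so r is undefined for it; the sketch returns NaN in that case.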

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
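
As a rough illustration of that definition, the sketch below ranks both variables (ties receive the average of the ranks they span, which is an assumption about tie handling) and then applies the Pearson formula to the ranks. The function names are hypothetical and the code is not taken from pandas-profiling.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        return float("nan")
    return cov / (spread_x * spread_y)


# A monotonic but nonlinear relationship (y = x cubed) still gives rho = 1.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
print(spearman_rho(x, y))
```

Because only the ordering matters, heavily skewed variables in this report (for example avg_dist or trips_in_first_30_days) influence ρ far less through their extreme values than they influence r.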

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
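
The pair-counting definition above corresponds to the variant often called tau-a. A minimal sketch of it follows; the function name `kendall_tau_a` is hypothetical. Library implementations commonly use a tie-adjusted variant (tau-b), so values computed this way may differ from the report's heatmap when the data contain many ties, which this dataset does.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six gives (5 - 1) / 6.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop visits every pair of observations, so it is only practical for small samples; for the 50000 rows profiled here that would be about 1.25 billion pairs.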

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
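
A sketch of a bias-corrected Cramér's V is given below. It uses the correction commonly attributed to Bergsma (2013): the squared phi coefficient and the table dimensions are each shrunk by a term that depends on the sample size. The function name is hypothetical, no continuity correction is applied, and whether pandas-profiling computes exactly this is an assumption.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two nominal variables of equal length."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corrected = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corrected = r - (r - 1) ** 2 / (n - 1)
    k_corrected = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corrected - 1, r_corrected - 1)
    if denominator <= 0:
        return float("nan")  # undefined when a variable has a single category
    return math.sqrt(phi2_corrected / denominator)


# Hypothetical data: identical labels give perfect association.
labels = ["a", "b"] * 50
print(cramers_v_corrected(labels, labels))
```

With a single-category variable such as month signup_date the denominator is zero and the coefficient is undefined, so the sketch returns NaN; this is worth keeping in mind when reading the "High correlation" warnings that involve that column.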

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

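index | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date
0 | 3.67 | 5.0 | 4.700000 | 1.10 | 1.0 | 15.4 | 4 | True | 46.2 | 1 | 0.0 | 1.0 | 0.0 | 143 | 1 | 6
1 | 8.26 | 5.0 | 5.000000 | 1.00 | 0.0 | 0.0 | 0 | False | 50.0 | 0 | 1.0 | 0.0 | 0.0 | 96 | 1 | 5
2 | 0.77 | 5.0 | 4.300000 | 1.00 | 1.0 | 0.0 | 3 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1
3 | 2.36 | 4.9 | 4.600000 | 1.14 | 1.0 | 20.0 | 9 | True | 80.0 | 1 | 0.0 | 1.0 | 0.0 | 170 | 1 | 6
4 | 3.13 | 4.9 | 4.400000 | 1.19 | 0.0 | 11.8 | 14 | False | 82.4 | 0 | 0.0 | 0.0 | 1.0 | 47 | 1 | 3
5 | 10.56 | 5.0 | 3.500000 | 1.00 | 1.0 | 0.0 | 2 | True | 100.0 | 1 | 0.0 | 0.0 | 1.0 | 148 | 1 | 6
6 | 3.95 | 4.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1
7 | 2.04 | 5.0 | 5.000000 | 1.00 | 1.0 | 0.0 | 2 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1
8 | 4.36 | 5.0 | 4.500000 | 1.00 | 0.0 | 0.0 | 2 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 11 | 1 | 2
9 | 2.37 | 5.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 2 | 1 | 1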

Last rows

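index | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date
49990 | 3.38 | 5.0 | 4.700000 | 1.08 | 1.0 | 33.3 | 1 | True | 33.3 | 0 | 1.0 | 0.0 | 0.0 | 125 | 1 | 5
49991 | 1.06 | 5.0 | 5.000000 | 1.25 | 1.0 | 100.0 | 0 | False | 0.0 | 1 | 0.0 | 0.0 | 1.0 | 172 | 1 | 6
49992 | 7.58 | 5.0 | 1.000000 | 1.00 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 1.0 | 0.0 | 1 | 1 | 1
49993 | 2.53 | 4.7 | 4.800000 | 1.11 | 1.0 | 11.1 | 3 | True | 55.6 | 1 | 1.0 | 0.0 | 0.0 | 179 | 1 | 7
49994 | 2.25 | 4.5 | 4.600000 | 1.44 | 1.0 | 37.5 | 1 | False | 25.0 | 0 | 1.0 | 0.0 | 0.0 | 148 | 1 | 5
49995 | 5.63 | 4.2 | 5.000000 | 1.00 | 1.0 | 0.0 | 0 | False | 100.0 | 1 | 0.0 | 1.0 | 0.0 | 131 | 1 | 6
49996 | 0.00 | 4.0 | 4.601559 | 1.00 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1
49997 | 3.86 | 5.0 | 5.000000 | 1.00 | 0.0 | 0.0 | 0 | True | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 111 | 1 | 5
49998 | 4.58 | 3.5 | 3.000000 | 1.00 | 1.0 | 0.0 | 2 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1
49999 | 3.49 | 5.0 | 4.601559 | 1.00 | 0.0 | 0.0 | 0 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 92 | 1 | 4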

Duplicate rows

Most frequent

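index | avg_dist | avg_rating_by_driver | avg_rating_of_driver | avg_surge | phone | surge_pct | trips_in_first_30_days | luxury_car_user | weekday_pct | active | city_0 | city_1 | city_2 | date_delta | month signup_date | month last_trip_date | count
6 | 0.00 | 5.0 | 5.000000 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 4
4 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 3
0 | 0.00 | 1.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 0 | 1 | 1 | 2
1 | 0.00 | 5.0 | 1.000000 | 1.0 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 0 | 1 | 1 | 2
2 | 0.00 | 5.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2
3 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 0.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 | 2
5 | 0.00 | 5.0 | 4.601559 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 1 | 1 | 1 | 2
7 | 0.00 | 5.0 | 5.000000 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2
8 | 0.00 | 5.0 | 5.000000 | 1.0 | 1.0 | 0.0 | 1 | False | 100.0 | 0 | 1.0 | 0.0 | 0.0 | 0 | 1 | 1 | 2
9 | 0.01 | 5.0 | 4.601559 | 1.0 | 0.0 | 0.0 | 1 | False | 0.0 | 0 | 0.0 | 0.0 | 1.0 | 1 | 1 | 1 | 2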
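
A table like "Most frequent" can be produced by counting how often each complete row occurs. The sketch below does this with the standard library on hypothetical rows; `duplicate_summary` is an illustrative helper, not a pandas-profiling function. The report does not state whether its figure of 90 duplicate rows counts distinct repeated rows or surplus copies, so the sketch returns both.

```python
from collections import Counter


def duplicate_summary(rows):
    """Count repeated rows.

    rows must be hashable (for example tuples). Returns a dict mapping each
    repeated row to its number of occurrences, and the number of surplus
    copies (occurrences beyond the first).
    """
    counts = Counter(rows)
    repeated = {row: c for row, c in counts.items() if c > 1}
    surplus = sum(c - 1 for c in repeated.values())
    return repeated, surplus


# Hypothetical rows of (avg_dist, trips_in_first_30_days, luxury_car_user).
rows = [
    (0.00, 1, False),
    (0.00, 1, False),
    (3.67, 4, True),
    (0.00, 1, False),
    (8.26, 0, False),
    (8.26, 0, False),
]
repeated, surplus = duplicate_summary(rows)
for row, count in sorted(repeated.items(), key=lambda item: -item[1]):
    print(row, count)
print("surplus copies:", surplus)
```

In this dataset the most frequent duplicates are riders with a very short average distance and a single trip in the first 30 days, so identical rows may well be distinct users rather than data-entry errors.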